Wikipedia-based Unsupervised Query Classification
Authors
Abstract
In this paper we present an unsupervised approach to query classification. The approach uses the Wikipedia encyclopedia as a corpus and exploits the statistical distribution of terms from both the category labels and the query in order to select an appropriate category. We have created a classifier that works with 55 categories extracted from the search section of the Bridgeman Art Library website. We have also evaluated our approach on the labeled data of the KDD-Cup 2005 Knowledge Discovery and Data Mining competition (800,000 real user queries to be classified into 67 target categories) and obtained promising results.
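The abstract does not give the scoring function, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the association between query terms and category-label terms is measured by pointwise mutual information over document co-occurrence, and it replaces Wikipedia with a tiny made-up corpus. The names `CORPUS`, `doc_freq`, `pmi` and `classify` are hypothetical.

```python
import math

# Toy stand-in for a Wikipedia corpus (hypothetical sentences, not real data).
CORPUS = [
    "monet painting of impressionist water lilies in oil",
    "impressionist painting uses visible brush strokes and light",
    "gothic cathedral architecture features pointed arches and stone vaults",
    "romanesque architecture has thick stone walls and round arches",
    "bronze sculpture casting produces statues of figures",
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


# Each document is reduced to the set of terms it contains.
DOCS = [set(tokenize(d)) for d in CORPUS]


def doc_freq(*terms):
    """Number of documents that contain all of the given terms."""
    return sum(1 for d in DOCS if all(t in d for t in terms))


def pmi(a, b):
    """PMI from document co-occurrence; 0.0 when the terms never co-occur."""
    joint = doc_freq(a, b)
    if joint == 0:
        return 0.0
    return math.log((joint * len(DOCS)) / (doc_freq(a) * doc_freq(b)))


def classify(query, categories):
    """Pick the category label whose terms are most associated with the query."""
    scores = {}
    for label in categories:
        pairs = [(q, c) for q in tokenize(query) for c in tokenize(label)]
        # Average association over all (query term, label term) pairs.
        scores[label] = sum(pmi(q, c) for q, c in pairs) / len(pairs) if pairs else 0.0
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    labels = ["painting", "architecture", "sculpture"]
    print(classify("monet water lilies", labels)[0])
    print(classify("stone arches", labels)[0])
```

No labeled training queries are needed here: the only inputs are the corpus, the category labels and the query, which is what makes this kind of approach unsupervised.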
Similar resources
BIT and MSRA at TREC KBA CCR Track 2013
Our strategy for the TREC KBA CCR track is to first retrieve as many vital or useful documents as possible and then apply more sophisticated classification and ranking methods to differentiate vital from useful documents. We submitted 10 runs generated by 3 approaches: query expansion, classification and learning to rank. Query expansion is an unsupervised baseline, in which we combine entities’ names ...
Unsupervised Synthesis of Multilingual Wikipedia Articles
In this paper, we propose an unsupervised approach to automatically synthesize Wikipedia articles in multiple languages. Taking an existing high-quality version of any entry as content guideline, we extract keywords from it and use the translated keywords to query the monolingual web of the target language. Candidate excerpts or sentences are selected based on an iterative ranking function and ...
ICL KBP Approaches to Knowledge Base Population at TAC2010
This paper reports the ICL KBP team's participation in the TAC2010 Knowledge Base Population Track. We submitted results for the Entity Linking task and the Slot Filling task. For Entity Linking, we implemented a simple unsupervised method to select the candidate entities in the Wikipedia Reference Knowledge Base for the given query document which describes the query name-string. For Slot Filling, we trea...
Improving Query Expansion for Image Retrieval via Saliency and Picturability
In this paper, we present a Wikipedia-based approach to query expansion for the task of image retrieval, by combining salient encyclopaedic concepts with the picturability of words. Our model generates the expanded query terms in a definite two-stage process instead of multiple iterative passes, requires no manual feedback, and is completely unsupervised. Preliminary results show that our propo...
Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore
Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...